Abstract

We present a new analysis for QuickHeapsort splitting it into the analysis 
of the partition-phases and the analysis of the heap-phases. This enables 
us to consider samples of non-constant size for the pivot selection and 
leads to better theoretical bounds for the algorithm. 

Furthermore we introduce some modifications of QuickHeapsort, both
in-place and using n extra bits. We show that on every input the expected
number of comparisons is n lg n - 0.03n + o(n) for the in-place variant and
n lg n - 0.997n + o(n) for the variant using n extra bits. Both estimates
improve the previously known best results. (It is conjectured [19] that the
in-place algorithm Bottom-Up-Heapsort uses at most n lg n + 0.4n comparisons
on average, and for Weak-Heapsort, which uses n extra bits, the average
number of comparisons is at most n lg n - 0.42n [8].) Moreover, our
non-in-place variant can even compete with index based Heapsort variants
(e.g. Rank-Heapsort [17]) and Relaxed-Weak-Heapsort (n lg n - 0.9n + o(n)
comparisons in the worst case), for which no O(n)-bound on the number of
extra bits is known.

Keywords. In-place sorting - heapsort - quicksort - analysis of algorithms

ACM classification: F.2.2 Nonnumerical Algorithms and Problems 



1 Introduction 

QuickHeapsort is a combination of Quicksort and Heapsort which was first
described by Cantone and Cincotti [2]. It is based on Katajainen's idea for
Ultimate Heapsort [12]. In contrast to Ultimate Heapsort it does not have any
O(n lg n) bound for the worst case running time. Its advantage is that it is very
fast in the average case and hence not only of theoretical interest.

Both algorithms have in common that first the array is partitioned into two 
parts. Then in one part a heap is constructed and the elements are successively 
extracted. Finally the remaining elements are treated recursively. The main 
advantage of this method is that for the sift-down only one comparison per
level is needed, whereas standard Heapsort needs two comparisons per level (for 
a description of standard Heapsort see some standard textbook, e.g. [(J ). This 
is a severe drawback and one of the reasons why standard Heapsort cannot 
compete with Quicksort in practice (of course there are also other reasons like 
cache behavior). Over time many solutions to this problem have appeared, such as
Bottom-Up-Heapsort [19] or MDR-Heapsort [KJ HE], which both perform the
sift-down by first going down to some leaf and then searching upward for the
correct position. Since one can expect that the final position of an inserted
element is near some leaf, this is a good heuristic and it leads to provably good
results.

The difference between QuickHeapsort and Ultimate Heapsort lies in the 
choice of the pivot element for partitioning the array. While for Ultimate
Heapsort the pivot is chosen as median of the whole array, for QuickHeapsort the
pivot is selected as median of some smaller sample (e.g. as median of 3 elements) 
or by some other method. 

In [5] the basic version with fixed index as pivot is analyzed and, together
with the median of three version, implemented and compared with other Quick-
and Heapsort variants. In [5] Edelkamp and Stiegeler compare these variants
with so-called Weak-Heapsort [7], some modifications of it (e.g.
Relaxed-Weak-Heapsort) and Quick-Weak-Heapsort, which relies on the same idea
as QuickHeapsort but uses Weak-Heaps instead of normal heaps. Weak-Heapsort and
Quick-Weak-Heapsort beat basic QuickHeapsort with respect to the number of
comparisons; however, they need O(n) bits of extra space (for
Relaxed-Weak-Heapsort this bound is only conjectured), hence they are not in-place.

We split the analysis of QuickHeapsort into three parts: the partitioning
phases, the heap construction and the heap extraction. This allows us to get
better bounds for the running time, especially when choosing the pivot as median of
a larger sample. It also simplifies the analysis. We introduce some modifications
of QuickHeapsort, too. The first one is in-place and needs n lg n - 0.03n + o(n)
comparisons on average, which is to the best of our knowledge better than any
other known in-place Heap- and Quicksort variant. We also examine a
modification using O(n) bits of extra space, which applies the ideas of MDR-Heapsort
to QuickHeapsort. With this method we can bound the average number of
comparisons by n lg n - 0.997n + o(n). Actually, a complicated, iterated in-place
MergeInsertion uses only n lg n - 1.3n + O(lg n) comparisons [TB]. Unfortunately,
for practical purposes this algorithm is not competitive.

Our contributions are as follows: 1. We give a simplified analysis which gives
better bounds than previously known. 2. Our approach yields the first precise
analysis of QuickHeapsort when the pivot element is taken from a larger sample.
3. We give a simple in-place modification of QuickHeapsort which saves 0.75n
comparisons. 4. We give a modification of QuickHeapsort using n extra bits
only and we can bound the expected number of comparisons. This bound is
better than the previously known bounds for the worst case of Heapsort variants
using O(n lg n) extra bits, for which best and worst case are almost the same. 5. We
have implemented QuickHeapsort, and our experiments confirm the theoretical
predictions.

The paper is organized as follows: Sect. 2 briefly describes the basic
QuickHeapsort algorithm together with our first improvement. In Sect. 3 we analyze
the expected running time of QuickHeapsort. Then we introduce some
improvements in Sect. 4, allowing O(n) additional bits. Finally, in Sect. 5 we present
our experimental results comparing the different versions of QuickHeapsort with
other Quicksort and Heapsort variants.

2 QuickHeapsort 

A two-layer-min-heap is an array A of n elements together with a partition
(G, R) of {1, ..., n} into green and red elements such that for all $g \in G$, $r \in R$ we
have $A[g] \le A[r]$. Furthermore, the green elements g satisfy the heap condition
$A[g] \le \min\{A[2g], A[2g+1]\}$, and if g is red, then 2g and 2g+1 are red, too. (The
conditions are required to hold only if the indices involved are in the range of 1
to n.) The green elements are called "green" because they can be extracted
out of the heap without caution, whereas the "red" elements are blocked.
Two-layer-max-heaps are defined analogously. We can think of a two-layer-heap as a
rooted binary tree such that each node is either green or red. Green nodes
satisfy the standard heap condition, children of red nodes are red.
Two-layer-heaps were defined in [12]. In [2] a different language is used for the
same concept (they describe the algorithm in terms of External Heapsort). Now we are
ready to describe the QuickHeapsort algorithm as it has been proposed in [2].
Most of it can also be found in pseudocode in App. [E].

We intend to sort an array A[1..n]. First, we choose a pivot p. This is the
randomized part of the algorithm. Then, just as in Quicksort, we rearrange the
array according to p. That means, using n - 1 comparisons the partitioning
function returns an index k and rearranges the array A so that $A[i] \ge A[k]$ for
i < k, A[k] = p, and $A[k] \ge A[j]$ for k < j. After the partitioning a
two-layer-heap is built out of the elements of the smaller part of the array, either the part
left of the pivot or right of the pivot. We call this smaller part the heap-area and
the larger part the work-area. More precisely, if k - 1 < n - k, then {1, ..., k - 1}
is the heap-area and {k + 1, ..., n} is the work-area. If $k - 1 \ge n - k$, then
{1, ..., k - 1} is the work-area and {k + 1, ..., n} is the heap-area. Note that
we know the final position of the pivot element without any further comparison.
Therefore, we do not count it to the heap-area nor to the work-area. If the
heap-area is the part of the array left of the pivot, a two-layer-max-heap is built,
otherwise a two-layer-min-heap is built.

At the beginning the heap-area is an ordinary heap, hence it is a
two-layer-heap consisting of green elements only. Now the heap extraction phase starts.
Let m denote the size of the heap-area. If we are in the case of a max-heap,
these m elements are moved to the back of the array, in the case of a min-heap
they are moved to the front of the array (which is in both cases the work-area).
For a max-heap, the extraction of one element works as follows: the root of the
heap is placed at the current position of the work-area (which at the beginning
is its last position). Then, starting from the root the resulting "hole" is trickled
down: always the larger child is moved up into the vacant position and then this
child is treated recursively. This stops as soon as a leaf is reached. We call this
the SpecialLeaf procedure (Alg. I5.2[ ), following [2]. Now, the element which
before was at the current position in the work-area is placed as a red element
in this hole at the leaf in the heap-area. Finally the current position in the
work-area is moved by one and the next element can be extracted.

The procedure sorts correctly, because after the partitioning it is guaranteed 
that all red elements are smaller than all green elements. Furthermore there is
enough space in the work-area to place all green elements of the heap, since the
heap is always the smaller part of the array. After extracting all green elements
the pivot element is placed at its final position and the remaining elements are
sorted recursively.
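
The description above can be turned into a short program. The following Python sketch is an illustration under simplifying assumptions, not the implementation evaluated in this paper: the pivot is a single random element, the heap is built with the standard bottom-up construction, the recursion is written as a loop, and all function names are chosen here for illustration only.

```python
import operator
import random


def _sift_down(a, base, m, i, above):
    """Standard sift-down at heap index i (1-based) in the heap a[base:base+m]."""
    x = a[base + i - 1]
    while 2 * i <= m:
        c = 2 * i
        if c < m and above(a[base + c], a[base + c - 1]):
            c += 1                              # right child wins
        if not above(a[base + c - 1], x):
            break
        a[base + i - 1] = a[base + c - 1]
        i = c
    a[base + i - 1] = x


def _special_leaf(a, base, m, above):
    """Trickle the hole at the root down to a leaf, one comparison per level.

    Returns the array position of the hole."""
    i = 1
    while 2 * i <= m:
        c = 2 * i
        if c < m and above(a[base + c], a[base + c - 1]):
            c += 1
        a[base + i - 1] = a[base + c - 1]       # winning child moves up
        i = c
    return base + i - 1


def _partition(a, lo, hi):
    """Partition a[lo..hi] around a random pivot: larger keys left, smaller right."""
    r = random.randint(lo, hi)
    a[r], a[hi] = a[hi], a[r]
    p, k = a[hi], lo
    for j in range(lo, hi):                     # hi - lo comparisons
        if a[j] > p:
            a[k], a[j] = a[j], a[k]
            k += 1
    a[k], a[hi] = a[hi], a[k]
    return k


def quick_heapsort(a):
    """Sort the list a in place in ascending order."""
    lo, hi = 0, len(a) - 1
    while lo < hi:
        k = _partition(a, lo, hi)
        left, right = k - lo, hi - k
        if left < right:
            # heap-area: left part (large keys), max-heap, fill work-area from the back
            base, m, above = lo, left, operator.gt
            targets = range(hi, hi - m, -1)
        else:
            # heap-area: right part (small keys), min-heap, fill work-area from the front
            base, m, above = k + 1, right, operator.lt
            targets = range(lo, lo + m)
        for i in range(m // 2, 0, -1):          # heap construction
            _sift_down(a, base, m, i, above)
        for pos in targets:                     # heap extraction
            root = a[base]
            hole = _special_leaf(a, base, m, above)
            a[hole] = a[pos]                    # work-area element enters as red
            a[pos] = root
        if left < right:
            final = hi - m                      # pivot goes right before the large keys
            a[k], a[final] = a[final], a[k]
            hi = final - 1
        else:
            final = lo + m                      # pivot goes right after the small keys
            a[k], a[final] = a[final], a[k]
            lo = final + 1
```

Note that the sketch needs no color information: an element coming from the work-area never wins a comparison against a green element, so the hole passes through the green part of the heap first.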

Actually we can improve the procedure, thereby saving 3n/4 comparisons by
a simple trick. Before the heap extraction phase starts in the heap-area with m
elements, we perform at most (m+2)/4 additional comparisons in order to arrange
all pairs of leaves which share a parent such that the left child is not smaller
than its right sibling. Now, in every call of SpecialLeaf, we can save exactly one
comparison, since we do not need to compare two leaves. For a max-heap we
only need to move up the left child and put the right one at the place of the
former left one. Summing up over all heaps during an execution of standard
QuickHeapsort, we invest (n+2t)/4 comparisons in order to save n comparisons,
where t is the number of recursive calls. The expected number of t is in O(lg n).
Hence, we can expect to save 3n/4 + O(lg n) comparisons. We call this version
the improved variant of QuickHeapsort.
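
As a quick sanity check on the cost of this preprocessing step, the sketch below counts the pairs of sibling leaves in a heap with m nodes; one comparison per pair is enough to order them. It assumes the usual 1-based array layout (children of node i at 2i and 2i+1). The function name and the bound (m+2)/4 mentioned in the comment are assumptions made for this illustration.

```python
def sibling_leaf_pairs(m):
    """Number of nodes in a heap of size m whose two children are both leaves."""
    pairs = 0
    for i in range(1, m // 2 + 1):          # all inner nodes
        left, right = 2 * i, 2 * i + 1
        # both children exist, and the left one (hence also the right one) is a leaf
        if right <= m and 2 * left > m:
            pairs += 1
    return pairs


# One extra comparison per pair; the count stays below (m + 2) / 4.
extra_comparisons = {m: sibling_leaf_pairs(m) for m in (3, 6, 7, 1000)}
```

For a heap with 7 nodes the pairs are the children of nodes 2 and 3, so two extra comparisons are invested before the seven extractions start.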



3 Analysis of QuickHeapsort 

This section contains the main contribution of the paper. We analyze the
number of comparisons.

Throughout the logarithm lg is always meant to base 2. By n we denote
the number of elements of an array to be sorted. We use standard O-notation
where $O(g)$, $o(g)$, and $\omega(g)$ denote classes of functions. In our analysis we do
not assume any random distribution of the input, i.e. it is valid for every
permutation of the input array. Randomization is used however for pivot selection.
With Pr[e] we denote the probability of some event e. The expected value of
a random variable T is denoted by E[T]. 

The number of assignments is bounded by some small constant times the 
number of comparisons. Let T(n) denote the number of comparisons during 
QuickHeapsort on a fixed array of n elements. We are going to split the analysis
of QuickHeapsort into three parts:

1. Partitioning with an expected number of comparisons $E[T_{\mathrm{part}}(n)]$ (average case).

2. Heap construction with at most $T_{\mathrm{con}}(n)$ comparisons (worst case).

3. Heap extraction (sorting phase) with at most $T_{\mathrm{ext}}(n)$ comparisons (worst case).

We analyze the three parts separately and put them together at the end. The
partitioning is the only randomized part of our algorithm. The expected
number of comparisons depends on the selection method for the pivot. For the
expected number of comparisons by QuickHeapsort on the input array we
obtain $E[T(n)] \le T_{\mathrm{con}}(n) + T_{\mathrm{ext}}(n) + E[T_{\mathrm{part}}(n)]$.

Theorem 3.1 Let $f \in \omega(1) \cap o(n)$ with $1 \le f(n) \le n$, e.g., $f(n) = \lg n$, and
let E[T(n)] be the expected number of comparisons by QuickHeapsort on a fixed
input array of size n. Choosing the pivot as median of f(n) randomly selected
elements in time O(f(n)), we have

$E[T(n)] \le n \lg n + 0.72n + o(n)$,

$E[T(n)] \le n \lg n - 0.03n + o(n)$ for the improved variant.

The significance of the selection method for the pivot becomes evident if we
compare Thm. 3.1 with the bounds one can derive for simpler selection methods,
as stated in Prop. 3.2.

Proposition 3.2 Choosing the pivot a) at random, respectively b) as
median-of-three, we have the following bounds:

a) $E[T(n)] \le n \lg n + 2.72n + o(n)$, and $E[T(n)] \le n \lg n + 1.97n + o(n)$ for the improved variant.

b) $E[T(n)] \le n \lg n + 1.92n + o(n)$, and $E[T(n)] \le n \lg n + 1.17n + o(n)$ for the improved variant.

The proofs of these results are postponed to Sect. 3.3. Note that it is enough to
prove the results without the improvement, since the difference is always 0.75n.

3.1 Heap Construction 

The standard heap construction [9] needs at most 2m comparisons to construct 
a heap of size m in the worst case and approximately 1.88m in the average case. 
For the mathematical analysis better theoretical bounds can be used. The best 
result we are aware of is due to Chen et al. in [5]. According to this result we
have $T_{\mathrm{con}}(m) \le 1.625m + o(m)$.

Earlier results are of similar magnitude: by [4] it has been known that
$T_{\mathrm{con}}(m) \le 1.632m + o(m)$ and by [10] it has been known that
$T_{\mathrm{con}}(m) \le 1.625m + o(m)$, but Gonnet and Munro used O(m) extra bits to get this
result, whereas the new result of Chen et al. is in-place (by using only O(lg m) extra bits).

During the execution of QuickHeapsort over n elements, every element is part
of a heap only once. Hence, the sizes of all heaps during the entire procedure
sum up to n. With the result of [5] the total number of comparisons performed
in the construction of all heaps satisfies:

Proposition 3.3

$T_{\mathrm{con}}(n) \le 1.625n + o(n)$.

3.2 Heap Extraction 

For a real number $r \in \mathbb{R}$ with $r > 0$ we define $\{r\}$ by the following condition:

$r = 2^k + \{r\}$ with $k \in \mathbb{Z}$ and $0 \le \{r\} < 2^k$.

This means that $2^k$ is the largest power of 2 which is less than or equal to r and
$\{r\}$ is the difference to that power, i.e. $\{r\} = r - 2^{\lfloor \lg r \rfloor}$. In this section we first
analyze the extraction phase of one two-layer-heap of size m. After that, we
bound the number of comparisons $T_{\mathrm{ext}}(n)$ performed in the worst case during
all heap extraction phases of one execution of QuickHeapsort on an array of size
n. Thm. 3.4 is our central result about heap extraction.
Theorem 3.4

$T_{\mathrm{ext}}(n) \le n \cdot (\lfloor \lg n \rfloor - 3) + 2\{n\} + O(\lg^2 n)$.

The proof of Thm. 3.4 covers almost the rest of Sect. 3.2. In the following,
the height height(v) of an element v in a heap H is the maximal distance from
that node to a leaf below it. The height of H is the height of its root. The
level level(v) of v is defined to be its distance from the root.

In this section we want to count the comparisons during SpecialLeaf
procedures only. Recall that a SpecialLeaf procedure is a cyclic shift on a path from
the root down to some leaf, and the number of comparisons is exactly the length
of this path. Hence the upper bound is the height of the heap. But there is a
better analysis.

Let us consider a heap with m green elements which are all extracted by
SpecialLeaf procedures. The picture is as follows: First, we color the green
root red. Next, we perform a cyclic shift defined by the SpecialLeaf procedure.
In particular, the leaf is now red. Moreover, red positions remain red, but
there is exactly one position v which has changed its color from green to red.
This position v is on the path defined by the SpecialLeaf procedure. Hence,
the number of comparisons needed to color the position v red is bounded by
height(v) + level(v).

The total number of comparisons E(m) to extract all m elements of a heap
H is therefore bounded by

$E(m) \le \sum_{v \in H} \left(\mathrm{height}(v) + \mathrm{level}(v)\right)$. (1)

We have $\mathrm{height}(H) - 1 \le \mathrm{height}(v) + \mathrm{level}(v) \le \mathrm{height}(H) = \lfloor \lg m \rfloor$ for all
$v \in H$. We now count the number of elements v where $\mathrm{height}(v) + \mathrm{level}(v) = \lfloor \lg m \rfloor$
and the number of elements v where $\mathrm{height}(v) + \mathrm{level}(v) = \lfloor \lg m \rfloor - 1$.
Since there are exactly $\{m\} + 1$ nodes of level $\lfloor \lg m \rfloor$, there are at most
$2\{m\} + 1 + \lg m$ elements v with $\mathrm{height}(v) + \mathrm{level}(v) = \lfloor \lg m \rfloor$. All other elements satisfy
$\mathrm{height}(v) + \mathrm{level}(v) = \lfloor \lg m \rfloor - 1$. We obtain

$E(m) \le 2 \cdot \{m\} \cdot \lfloor \lg m \rfloor + (m - 2 \cdot \{m\})(\lfloor \lg m \rfloor - 1) + O(\lg m)$
$= m \cdot (\lfloor \lg m \rfloor - 1) + 2 \cdot \{m\} + O(\lg m)$. (2)

Note that this is an estimate of the worst case, however this analysis also shows
that the best case only differs by O(lg m)-terms from the worst case.

Now, we want to estimate the number of comparisons in the worst case
performed during all heap extraction phases together. During QuickHeapsort
over n elements we create a sequence $H_1, \ldots, H_t$ of heaps of green elements
which are extracted using the SpecialLeaf procedure. Let $m_i = |H_i|$ be the size
of the i-th heap. The sequence satisfies $2m_i \le n - \sum_{j<i} m_j$, because heaps are
constructed and extracted on the smaller part of the array.

Here comes a subtle observation: Assume that $m_1 + m_2 \le n/2$. If we
replace the first two heaps with one heap H' of size $|H'| = m_1 + m_2$, then
the analysis using the sequence $H', H_3, \ldots, H_t$ cannot lead to a better bound.
Continuing this way, we may assume that we have $t \in O(\lg n)$ and therefore
$\sum_{1 \le i \le t} O(\lg m_i) \subseteq O(\lg^2 n)$. With Eq. (2) we obtain the bound



$T_{\mathrm{ext}}(n) \le \sum_{i=1}^{t} E(m_i) = \sum_{i=1}^{t} \left(m_i \cdot \lfloor \lg m_i \rfloor + 2\{m_i\}\right) - n + O(\lg^2 n)$. (3)
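
The closed form in Eq. (2) can be checked numerically against the sum in Eq. (1). The sketch below assumes the usual 1-based array layout of a heap with m nodes; the function names are chosen for this illustration. It sums height(v) + level(v) over all nodes and compares the result with $m(\lfloor \lg m \rfloor - 1) + 2\{m\}$.

```python
def path_length_sum(m):
    """Sum of height(v) + level(v) over all nodes v of a heap with m nodes."""
    total = 0
    for v in range(1, m + 1):
        level = v.bit_length() - 1          # distance from the root
        height, w = 0, v
        while 2 * w <= m:                   # the leftmost path is a longest one
            w *= 2
            height += 1
        total += height + level
    return total


def closed_form(m):
    """m * (floor(lg m) - 1) + 2 * {m}, the main term of Eq. (2)."""
    k = m.bit_length() - 1                  # floor(lg m)
    frac = m - (1 << k)                     # {m}
    return m * (k - 1) + 2 * frac
```

For m = 7 every node has height(v) + level(v) = 2, so the sum is 14, while the closed form gives 13; the difference is covered by the O(lg m)-term.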



Later we will replace the $m_i$ by other positive real numbers. Therefore we
define the following notion. Let $1 \le \nu \in \mathbb{R}$. We say a sequence $x_1, x_2, \ldots, x_t$
with $x_i \in \mathbb{R}_{>0}$ is valid w.r.t. $\nu$, if for all $1 \le i \le t$ we have:

$2x_i \le \nu - \sum_{j<i} x_j$.

As just mentioned the initial sequence $m_1, m_2, \ldots, m_t$ is valid w.r.t. n. Let us
define a continuous function $F : \mathbb{R}_{>0} \to \mathbb{R}$ by

$F(x) = x \cdot \lfloor \lg x \rfloor + 2\{x\}$.

It is continuous, since for $x = 2^k$, $k \in \mathbb{Z}$ we have
$F(x) = xk = \lim_{\varepsilon \to 0} (x - \varepsilon)(k - 1) + 2\{x - \varepsilon\}$.
It is piecewise differentiable with right derivative $\lfloor \lg x \rfloor + 2$.
Therefore:

Lemma 3.5 Let $x \ge y \ge \delta \ge 0$. Then we have the inequalities:

$F(x) + F(y) \le F(x + \delta) + F(y - \delta)$ and $F(x) + F(y) \le F(x + y)$.
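
Both inequalities of Lem. 3.5 can be tried out on small integer arguments. The following sketch restricts F to positive integers so that all arithmetic is exact; the helper names are chosen for this illustration, and the check is a numerical experiment, not a substitute for the proof.

```python
def floor_lg(n):
    """floor(lg n) for a positive integer n."""
    return n.bit_length() - 1


def F(n):
    """F(n) = n * floor(lg n) + 2 * {n} for a positive integer n."""
    k = floor_lg(n)
    return n * k + 2 * (n - (1 << k))


def lemma_3_5_holds(x, y, d):
    """Check both inequalities for integers x >= y > d >= 0."""
    return (F(x) + F(y) <= F(x + d) + F(y - d)
            and F(x) + F(y) <= F(x + y))


# Exhaustive check on a small range (y - d >= 1 keeps every argument positive).
all_ok = all(lemma_3_5_holds(x, y, d)
             for x in range(1, 61)
             for y in range(1, x + 1)
             for d in range(0, y))
```

For example F(3) + F(2) = 5 + 2 = 7, while F(4) + F(1) = 8 + 0 = 8 and F(5) = 12.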

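Lemma 3.6 Let $1 \le \nu \in \mathbb{R}$. For all sequences $x_1, x_2, \ldots, x_t$ with $x_i \in \mathbb{R}_{>0}$,
which are valid w.r.t. $\nu$, we have:

$\sum_{i=1}^{t} F(x_i) \le \sum_{i=1}^{\lfloor \lg \nu \rfloor} F\left(\frac{\nu}{2^i}\right)$.

Proof. The result is true for $\nu \le 2$, because then $F(x_i) \le F(\nu/2) \le F(1) = 0$
for all i. Thus, we may assume $\nu \ge 2$. We perform induction on t. For t = 1
the statement is clear, since $\lg \nu \ge 1$ and $x_1 \le \nu/2$. Now let t > 1. By Lem. 3.5
we have $F(x_1) + F(x_2) \le F(x_1 + x_2)$. Now, if $x_1 + x_2 \le \frac{\nu}{2}$, then the sequence
$x_1 + x_2, x_3, \ldots, x_t$ is valid, too; and we are done by induction. Hence, we may
assume $x_1 + x_2 > \frac{\nu}{2}$. If $x_1 \le x_2$, then

$2x_1 = 2x_2 + 2(x_1 - x_2) \le \nu - x_1 + 2(x_1 - x_2) = \nu - x_2 + x_1 - x_2 \le \nu - x_2$.

Thus, if $x_1 \le x_2$, then the sequence $x_2, x_1, x_3, \ldots, x_t$ is valid, too. Thus, it is
enough to consider $x_1 \ge x_2$ with $x_1 + x_2 > \frac{\nu}{2}$.

We have $\frac{\nu}{2} \ge 1$ and the sequence $x'_2, x_3, \ldots, x_t$ with $x'_2 = x_1 + x_2 - \frac{\nu}{2}$ is valid
w.r.t. $\nu/2$, because

$x'_2 = x_1 + x_2 - \frac{\nu}{2} \le x_1 + \frac{\nu - x_1}{2} - \frac{\nu}{2} = \frac{x_1}{2} \le \frac{\nu}{4}$.

Therefore, by induction on t and Lem. 3.5 we obtain the claim:

$\sum_{i=1}^{t} F(x_i) \le F(\nu/2) + F(x'_2) + \sum_{i=3}^{t} F(x_i) \le F(\nu/2) + \sum_{i=2}^{\lfloor \lg \nu \rfloor} F\left(\frac{\nu}{2^i}\right) \le \sum_{i=1}^{\lfloor \lg \nu \rfloor} F\left(\frac{\nu}{2^i}\right)$.

□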
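Lemma 3.7

$\sum_{i=1}^{\lfloor \lg n \rfloor} F\left(\frac{n}{2^i}\right) \le F(n) - 2n + O(\lg n)$.

Proof.

$\sum_{i=1}^{\lfloor \lg n \rfloor} F\left(\frac{n}{2^i}\right) = n \lfloor \lg n \rfloor \cdot \sum_{i=1}^{\lfloor \lg n \rfloor} \frac{1}{2^i} - n \cdot \sum_{i=1}^{\lfloor \lg n \rfloor} \frac{i}{2^i} + 2\{n\} \cdot \sum_{i=1}^{\lfloor \lg n \rfloor} \frac{1}{2^i}$

$\le n \lfloor \lg n \rfloor \cdot \sum_{i \ge 1} \frac{1}{2^i} - n \cdot \sum_{i \ge 1} \frac{i}{2^i} + 2\{n\} \cdot \sum_{i \ge 1} \frac{1}{2^i} + \frac{n}{2^{\lfloor \lg n \rfloor}} \cdot \sum_{i > 0} \frac{i + \lfloor \lg n \rfloor}{2^i}$

$= n \lfloor \lg n \rfloor - 2n + 2\{n\} + O(\lg n)$.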

Applying these lemmata to Eq. (3) yields the proof of Thm. 3.4.

Corollary 3.8 We have

$T_{\mathrm{ext}}(n) \le n \lg n - 2.9139n + O(\lg^2 n)$.

Proof. By [HI Thm. 1] we have $F(n) - 2n \le n \lg n - 1.9139n$. Hence, Cor. 3.8
follows directly from Thm. 3.4. □
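
The constant 1.9139 is taken from the cited theorem. As an independent plausibility check (a sketch with names chosen here, not part of the cited proof), one can evaluate the inequality for many values of n. Writing $n = 2^k(1+\theta)$ with $0 \le \theta < 1$ gives $F(n) - n \lg n = n \cdot (2\theta/(1+\theta) - \lg(1+\theta))$, and elementary calculus shows that this factor is largest at $1 + \theta = 2 \ln 2$.

```python
import math


def F(n):
    """F(n) = n * floor(lg n) + 2 * {n} for a positive integer n."""
    k = n.bit_length() - 1
    return n * k + 2 * (n - (1 << k))


def extraction_slack(n):
    """(n lg n - 1.9139 n) - (F(n) - 2n); nonnegative if the inequality holds."""
    return n * math.log2(n) - 1.9139 * n - (F(n) - 2 * n)


# Largest possible value of (F(n) - n lg n) / n, attained near n / 2^k = 2 ln 2.
worst_ratio = 2 - 1 / math.log(2) - math.log2(2 * math.log(2))
```

The value of `worst_ratio` is about 0.0861, which matches 2 - 1.9139.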

3.3 Partitioning 

In the following $T_{\mathrm{pivot}}(n)$ denotes the number of comparisons required to choose
the pivot element in the worst case; and, as before, $E[T_{\mathrm{part}}(n)]$ denotes the
expected number of comparisons performed during partitioning. We have the
following recurrence:

$E[T_{\mathrm{part}}(n)] \le n - 1 + T_{\mathrm{pivot}}(n) + \sum_{k=1}^{n} \Pr[\text{pivot} = k] \cdot E[T_{\mathrm{part}}(\max\{k - 1, n - k\})]$. (4)

If we choose the pivot at random, then we obtain by standard methods:

$E[T_{\mathrm{part}}(n)] \le n - 1 + \frac{1}{n} \cdot \sum_{k=1}^{n} E[T_{\mathrm{part}}(\max\{k - 1, n - k\})] \le 4n$. (5)

Similarly, if we choose the pivot with the median-of-three, then we obtain:

$E[T_{\mathrm{part}}(n)] \le 3.2n + O(\lg n)$. (6)
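
The bound 4n for a random pivot can be observed by evaluating recurrence (5) directly. The sketch below treats the recurrence as an equality with E(0) = E(1) = 0 and tabulates its solution; the function name is chosen for this illustration.

```python
def partition_cost_random_pivot(n_max):
    """Solution of recurrence (5), taken with equality, for 0 <= n <= n_max."""
    E = [0.0] * (n_max + 1)
    for n in range(2, n_max + 1):
        # after partitioning, the recursion continues on the larger part
        total = sum(E[max(k - 1, n - k)] for k in range(1, n + 1))
        E[n] = n - 1 + total / n
    return E


costs = partition_cost_random_pivot(400)
```

For instance E(2) = 1 and E(3) = 2 + 2/3, and all tabulated values stay below 4n.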

The proof of Prop. 3.2 follows from Equations (5) and (6), Thm. 3.4 and
Prop. 3.3. Using a growing number of elements (as n grows) as sample for
the pivot selection, we can do better. Thm. 3.1 follows now from Thm. 3.4,
Prop. 3.3 and Thm. 3.9.
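
The linear terms in Thm. 3.1 and Prop. 3.2 are obtained by adding the constants of the three phases. The following sketch only redoes this arithmetic; the variable and function names are chosen for this illustration.

```python
CONSTRUCTION = 1.625     # Prop. 3.3: heap construction, 1.625n + o(n)
EXTRACTION = -2.9139     # Cor. 3.8: heap extraction, n lg n - 2.9139n + O(lg^2 n)
IMPROVEMENT = 0.75       # saving of the improved variant from Sect. 2


def linear_term(partition, improved=False):
    """Coefficient c in the resulting bound n lg n + c*n + o(n)."""
    c = CONSTRUCTION + EXTRACTION + partition
    return c - IMPROVEMENT if improved else c


# Partitioning costs: 2n (growing sample), 4n (random pivot), 3.2n (median of three).
bounds = {p: (linear_term(p), linear_term(p, improved=True)) for p in (2, 4, 3.2)}
```

With partitioning cost 2n this gives roughly 0.711 and -0.039, which are rounded up to the constants 0.72 and -0.03 stated in Thm. 3.1.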






□ 
Theorem 3.9 Let $f \in \omega(1) \cap o(n)$ with $1 \le f(n) \le n$. When choosing the pivot
as median of f(n) randomly selected elements in time O(f(n)) (e.g. with the
algorithm of \T^), the expected number of comparisons used in all recursive calls
of partitioning is in 2n + o(n).

Thm. 3.9 is close to a well-known result in [14, Thm. 5] on Quickselect, see
Cor. 3.11. Formally speaking we cannot use it directly, because we deal with
QuickHeapsort, where after partitioning the recursive call is on the larger part.
Because of that, and for the sake of completeness, we give a proof. Moreover,
our proof is elementary and simpler than the one in [14]. The key point is
Lem. 3.10. Its proof is rather standard, see App. [B] for details.

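Lemma 3.10 Let $0 < \delta < \frac{1}{2}$. If we choose the pivot as median of $2c + 1$
elements such that $2c + 1 \le \frac{n}{2}$, then we have $\Pr[\text{pivot} \le \frac{n}{2} - \delta n] < (2c + 1)\alpha^c$
where $\alpha = 4\left(\frac{1}{4} - \delta^2\right) < 1$.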

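Proof of Thm. 3.9. As an abbreviation, we let $E(n) = E[T_{\mathrm{part}}(n)]$ be the
expected number of comparisons performed during partitioning. We are going
to show that for all $\varepsilon > 0$ there is some $D \in \mathbb{R}$ such that

$E(n) \le (2 + \varepsilon)n + D$. (7)

So, we fix some $1 > \varepsilon > 0$. We choose $\delta > 0$ such that $(2 + \varepsilon)\delta < \frac{\varepsilon}{4}$. Moreover, for
this proof let $\mu = \frac{n+1}{2}$. Positions of possible pivots k with $\mu - \delta n \le k \le \mu + \delta n$
form a small fraction of all positions, and they are located around the median.
Nevertheless, applying Lem. 3.10 with $c = f(n) \in \omega(1) \cap o(n)$ yields for all n,
which are large enough:

$\Pr[\text{pivot} < \mu - \delta n] < (2f(n) + 1) \cdot \alpha^{f(n)} \le \frac{1}{48}\varepsilon$. (8)

The analogous inequality holds for $\Pr[\text{pivot} > \mu + \delta n]$. Because $T_{\mathrm{pivot}}(n) \in o(n)$,
we have

$T_{\mathrm{pivot}}(n) \le \frac{1}{8}\varepsilon n$ (9)

for n large enough. Now, we choose $n_0$ such that Eq. (8) and Eq. (9) hold for
$n \ge n_0$ and such that we have $(2 + \varepsilon)\delta + \frac{2}{n_0} < \frac{\varepsilon}{4}$. We set $D = E(n_0) + 1$. Hence
for $n \le n_0$ the desired result Eq. (7) holds. Now, let $n > n_0$. From Eq. (4) we
obtain by symmetry:

$E(n) \le n - 1 + T_{\mathrm{pivot}}(n) + \sum_{k=\lceil \mu - \delta n \rceil}^{\lfloor \mu + \delta n \rfloor} \Pr[\text{pivot} = k] \cdot E(k - 1) + 2 \sum_{k=\lfloor \mu + \delta n \rfloor + 1}^{n} \Pr[\text{pivot} = k] \cdot E(k - 1)$.

Since E is monotone, E(k) can be bounded by the highest value in the respective
interval:

$E(n) \le n + \frac{1}{8}\varepsilon n + \Pr[\mu - \delta n \le \text{pivot} \le \mu + \delta n] \cdot E(\lfloor \mu + \delta n \rfloor) + 2 \Pr[\text{pivot} > \mu + \delta n] \cdot E(n - 1)$

$\le n + \frac{1}{8}\varepsilon n + \left(1 - \frac{1}{24}\varepsilon\right) \cdot E(\lfloor \mu + \delta n \rfloor) + 2 \cdot \frac{1}{48}\varepsilon \cdot E(n - 1)$.

By induction we assume $E(k) \le (2 + \varepsilon)k + D$ for $k < n$. Hence:

$E(n) \le n + \frac{1}{8}\varepsilon n + \left(1 - \frac{1}{24}\varepsilon\right) \cdot \left((2 + \varepsilon) \cdot (\mu + \delta n) + D\right) + \frac{1}{24}\varepsilon \cdot \left((2 + \varepsilon)n + D\right)$

$\le n + (2 + \varepsilon) \cdot \left(\frac{n+1}{2} + \delta n\right) + \frac{1}{8}\varepsilon n + \frac{1}{24}\varepsilon(2 + \varepsilon)n + D$

$\le 2n + 1 + \frac{\varepsilon}{2} + (2 + \varepsilon)\delta n + \frac{3}{4}\varepsilon n + D \le (2 + \varepsilon)n + D$.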



□ 

Corollary 3.11 ([II]) Let $f \in \omega(1) \cap o(n)$ with $1 \le f(n) \le n$. When
implementing Quickselect with the median of f(n) randomly selected elements as
pivot, the expected number of comparisons is 2n + o(n).

Proof. In QuickHeapsort the recursion is always on the larger part of the array.
Hence, the number of comparisons in partitioning for QuickHeapsort is an upper
bound on the number of comparisons in Quickselect. □

In [T3] it is also proved that choosing the pivot as median of $O(\sqrt{n})$ elements
is optimal for Quicksort as well as for Quickselect. This suggests that we choose
the same value in QuickHeapsort, which is backed by our experiments.



4 Modifications of QuickHeapsort using Extra-space

In this section we want to describe some modification of QuickHeapsort using
n bits of extra storage. We introduce two bit-arrays. In one of them (the
CompareArray), which actually uses two bits per element, we store the comparisons
already done (we need two bits, because there are three possible values to
store: right, left, unknown). In the other one (the RedGreenArray) we
store which element is red and which is green.

Since the heaps have maximum size n/2, the RedGreenArray only requires
n/2 bits. The CompareArray is only needed for the inner nodes of the heaps,
i.e. length n/4 is sufficient. In total this sums up to n extra bits.

For the heap construction we do not use the algorithms described in Sect. 3.1.
With the CompareArray we can do better by using the algorithm of McDiarmid
and Reed [15].
The heap construction works similarly to Bottom-Up-Heapsort, i.e. the array
is traversed backward calling for all inner positions i the Reheap procedure
on i. The Reheap procedure takes the subheap with root i and restores the
heap condition, if it is violated at the position i. First, the Reheap procedure
determines a special leaf using the SpecialLeaf procedure as described in Sect. 2,
but without moving the elements. Then, the final position of the former root
is determined going upward from the special leaf (bottom-up-phase). At the
end, the elements above this final position are moved up towards the root by
one position. That means that all but one of the elements which are compared during
the bottom-up-phase stay in their places. Since in the SpecialLeaf procedure
these elements have been compared with their siblings, these comparisons can
be stored in the CompareArray and can be used later.

With another improvement concerning the construction of heaps with seven 
elements as in [3] the benefits of this array can be exploited even more. 

The RedGreenArray is used during the sorting phase only. Its functionality
is straightforward: Every time a red element is inserted into the heap, the
corresponding bit is set to red. The SpecialLeaf procedure can stop as soon
as it reaches an element without green children. Whenever a red and a green
element have to be compared, the comparison can be skipped.

Theorem 4.1 Let $f \in \omega(1) \cap o(n)$ with $1 \le f(n) \le n$, e.g., $f(n) = \lg n$, and
let E[T(n)] be the expected number of comparisons by QuickHeapsort using the
CompareArray with the improvement of [3] and the RedGreenArray on a fixed
input array of size n. Choosing the pivot as median of f(n) randomly selected
elements in time O(f(n)), we have

$E[T(n)] \le n \lg n - 0.997n + o(n)$.

We can analyze the savings by the two arrays separately, because the
CompareArray only affects comparisons between two green elements, while the
RedGreenArray only affects comparisons involving at least one red element. First,
we consider the heap construction using the CompareArray. With this array
we obtain the same worst case bound as for the standard heap construction
method. However, the CompareArray has the advantage that at the end of the
heap construction many comparisons are stored in the array and can be reused
for the extraction phase. More precisely: For every comparison except the first
one made when going upward from the special leaf, one comparison is stored
in the CompareArray, since for every additional comparison one element on the
path defined by SpecialLeaf stays at its place. Because every pair of siblings
has to be compared at one point during the heap construction or extraction,
all these stored comparisons can be reused. Hence, we only have to count the
comparisons in the SpecialLeaf procedure during the construction plus $\frac{m}{2}$ for the
first comparison when going upward. Thus, we get an amortized bound for the
comparisons during construction of $\frac{3m}{2}$.

In [3] the notion of Fine-Heaps is introduced. A Fine-Heap is a heap with
the additional CompareArray such that for every node the larger child is stored
in the array. Such a Fine-Heap of size m can be constructed using the above
method with 2m comparisons. In [3] Carlsson, Chen and Mattsson showed that
a Fine-Heap of size m actually can be constructed with only $\frac{23}{12}m + O(\lg^2 m)$
comparisons. That means we have to invest $\frac{23}{12}m + O(\lg^2 m)$ for the heap
construction and at the end there are $\frac{m}{2}$ comparisons stored in the array. All these
comparisons stored in the array are used later. Summing up over all heaps
during an execution of QuickHeapsort, we can save another $\frac{1}{12}n$ comparisons
additionally to the comparisons saved by the CompareArray with the result of
[3].

Hence, for the amortized cost of the heap construction $T^{\mathrm{amort}}_{\mathrm{con}}$ (i.e. the
number of comparisons needed to build the heap minus the number of comparisons
stored in the CompareArray after the construction which all can be reused later)
we have obtained:

Proposition 4.2

$T^{\mathrm{amort}}_{\mathrm{con}}(n) \le \frac{17}{12}n + o(n)$.

This bound is slightly better than the average case for the heap construction
with the algorithm of [15] which is 1.52n.

Now, we want to count the number of comparisons we save using the
RedGreenArray. We distinguish the two cases that two red elements are compared
and that a red and a green element are compared. Every position in the heap
has to turn red at one point. At that time, all nodes below this position are
already red. Hence, for that element we save as many comparisons as the
element is above the bottom level. Summing over all levels of a heap of size m the
saving results in:

$\frac{m}{4} \cdot 1 + \frac{m}{8} \cdot 2 + \cdots = m \cdot \sum_{i \ge 1} i 2^{-i-1} = m$.

This estimate is exact up to O(lg m)-terms. Since the expected number of heaps
is O(lg n), we obtain for the overall saving the value $T_{\mathrm{saveRR}}(n) = n + O(\lg^2 n)$.
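
For a perfect heap this sum can be evaluated exactly. The sketch below (an illustration with a function name chosen here, restricted to heaps with $m = 2^{h+1} - 1$ nodes) adds up, for every node, its distance to the bottom level and confirms that the result differs from m only by a logarithmic term.

```python
def red_red_saving(h):
    """Sum over all nodes of a perfect heap of height h of the node's
    distance to the bottom level."""
    return sum((1 << level) * (h - level) for level in range(h + 1))


# For m = 2^(h+1) - 1 nodes the sum equals m - h - 1, i.e. m up to an O(lg m)-term.
examples = {h: ((1 << (h + 1)) - 1, red_red_saving(h)) for h in range(6)}
```

For h = 2 (m = 7) the two nodes on level 1 save one comparison each and the root saves two, which gives 4 = 7 - 2 - 1.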

Another place where we save comparisons with the RedGreenArray is when
a red element is compared with a green element. For every inner node it happens
at least once, namely when the node loses its last green child, that we compare
a red child with a green child. Hence, we save at least as many comparisons
as there are inner nodes with two children, i.e. at least $\frac{m}{2} - 1$. Since every
element, except the expected O(lg n) pivot elements, is part of a heap exactly
once, we save at least

$T_{\mathrm{saveRG}}(n) \ge \frac{n}{2} + O(\lg n)$

comparisons when comparing green with red elements. In the average case the
saving might be even slightly higher, since comparisons can also be saved when
a node does not lose its last green child.

Summing up all our savings and using the median of $f(n) \in \omega(1) \cap o(n)$ as
pivot we obtain the proof of Thm. 4.1:

$E[T(n)] \le T^{\mathrm{amort}}_{\mathrm{con}}(n) + T_{\mathrm{ext}}(n) + E[T_{\mathrm{part}}(n)] - T_{\mathrm{saveRR}}(n) - T_{\mathrm{saveRG}}(n)$

$\le \frac{17}{12}n + n \cdot (\lfloor \lg n \rfloor - 3) + 2\{n\} + 2n - \frac{3n}{2} + o(n)$

$\le n \lg n - 0.997n + o(n)$.
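
The last step combines the linear terms with the estimate of $n \lfloor \lg n \rfloor + 2\{n\}$ against $n \lg n$. The following sketch redoes this arithmetic. The individual constants are read off the derivation above, and `c_F` is the maximum of $2\theta/(1+\theta) - \lg(1+\theta)$ on [0, 1), computed here independently; all names are chosen for this illustration.

```python
from fractions import Fraction
import math

construction = Fraction(17, 12)   # amortized heap construction (Prop. 4.2)
extraction = Fraction(-3)         # from n * (floor(lg n) - 3) + 2{n}
partitioning = Fraction(2)        # Thm. 3.9
savings = Fraction(3, 2)          # red/red plus red/green savings: n + n/2

# n * floor(lg n) + 2{n} <= n lg n + c_F * n, maximum attained at 1 + theta = 2 ln 2
c_F = 2 - 1 / math.log(2) - math.log2(2 * math.log(2))

coefficient = float(construction + extraction + partitioning - savings) + c_F
```

The resulting coefficient is about -0.9973, which is rounded to the constant -0.997 of Thm. 4.1.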
5 Experimental Results and Conclusion

In Table 1 we compare the different versions of QuickHeapsort we considered
in this paper, i.e. the basic version, the improved variant of Sect. 2 and the
version using bit-arrays (however, without the modification by [3]). We do not
present actual running times, because the main focus of this paper is on the
number of comparisons and therefore our implementation is not tuned to
minimize running times (in [11] it is shown that in order to minimize the running
time of Quicksort it can be even better to invest some more comparisons by
choosing a 'bad' pivot; considering such effects would go far beyond the scope
of this work). More results with other pivot selection strategies are in Table 2
in App. [C], confirming that a sample size of $\sqrt{n}$ is optimal for pivot selection
with respect to the number of comparisons and also that the o(n)-terms in
Thm. 3.1 and Thm. 3.9 are not too big. For the heap construction we
implemented the normal algorithm due to Floyd [5] as well as the algorithm using
the extra bit-array (which is the same as in MDR-Heapsort). All algorithms
are implemented with median of 3 and with median of $\sqrt{n}$ elements as pivot.
We compare them with Quicksort implemented with the same pivot selection
strategies, Bottom-Up-Heapsort and MDR-Heapsort. In Table 1 we also added
the values for Relaxed-Weak-Heapsort which were presented in [5]. All the
numbers displayed here are average values over 100 runs with random data. As our
theoretical estimates predict, QuickHeapsort with bit-arrays beats all other
variants including Relaxed-Weak-Heapsort when implemented with median of $\sqrt{n}$
for pivot selection (it needs $260584 \approx 0.26 \cdot 10^6$ comparisons less than
Relaxed-Weak-Heapsort). It also performs $326728 \approx 0.33 \cdot 10^6$ comparisons less than


Sorting algorithm 


Average number of com- 
parisons for n = 10 6 


Basic QuickHeapsort with median of 3 


21327478 


Basic QuickHeapsort with median of -y/n 


20783631 


Improved QuickHeapsort with median of 3 


20639046 


Improved QuickHeapsort with median of i/n 


20135688 


QuickHeapsort with bit-arrays with median 


19207289 


QuickHeapsort with bit-arrays with median 
of *Jn 


18690841 *Bcst result* 


Quicksort with median of 3 


21491310 


Quicksort with median of ^fn 


19548149 


Bottom-Up-Heapsort 


20294866 


MDR-Heapsort 


20001084 


Relaxed- Weak- Heapsort 


18951425 


Lower Bound: lgn! 


18488884 « lg(10 e !) 



Table 1: QuickHeapsort and other algorithms tested on 10 6 elements (the data 
for Relaxed- Weak- Heapsort is taken from [5]). 

our theoretical predictions which are 10 6 • lg(10 6 ) - 0.9139 • 10 6 « 19017569 
comparisons. 
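The lower-bound row and the quoted difference to Relaxed-Weak-Heapsort can be
recomputed with a few lines of Python. This is only a sketch: it obtains
$\lg(n!)$ from the log-gamma function ($\ln n! = \ln \Gamma(n+1)$), and the
variable names are chosen here for illustration; the measured values are
copied from Table 1.

```python
import math


def lg_factorial(n):
    """Return lg(n!) via the log-gamma function, since ln(n!) = lgamma(n + 1)."""
    return math.lgamma(n + 1) / math.log(2)


n = 10**6
lower_bound = lg_factorial(n)
best_measured = 18690841           # QuickHeapsort with bit-arrays, median of sqrt(n)
relaxed_weak_heapsort = 18951425   # value quoted from [5]

print(int(lower_bound))                          # information-theoretic lower bound
print(relaxed_weak_heapsort - best_measured)     # gap to Relaxed-Weak-Heapsort
print((best_measured - lower_bound) / n)         # gap to the lower bound, per element
```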

In this paper we have shown that with known techniques QuickHeapsort
can be implemented with expected number of comparisons less than
$n \lg n - 0.03n + o(n)$ and extra storage $O(1)$. On the other hand, using
$n$ extra bits we can improve this to $n \lg n - 0.997n + o(n)$, i.e. we
showed that QuickHeapsort can compete with the most advanced Heapsort
variants. These theoretical estimates were also confirmed by our experiments.
We also considered different pivot selection schemes. For any constant size
sample for pivot selection, QuickHeapsort beats Quicksort for large $n$, since
Quicksort has an expected running time of $\approx C n \lg n$ with $C > 1$.
However, when choosing the pivot as median of $\sqrt{n}$ elements (i.e. with
the optimal strategy), our experiments show that Quicksort needs fewer
comparisons than QuickHeapsort. With bit-arrays, however, QuickHeapsort is the
winner again. In order to make the last statement rigorous, better theoretical
bounds for Quicksort with sampling $\sqrt{n}$ elements are needed. For future
work it would also be of interest to prove the optimality of $\sqrt{n}$
elements for pivot selection in QuickHeapsort, to estimate the lower order
terms of the average running time of QuickHeapsort and also to find an exact
average case analysis for the saving by the bit-arrays.
APPENDIX



A Proof of Lem. 3.5



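Proof. Since the right derivative is monotonically increasing we have:

\[ F(x + \delta) - F(x) = \int_{x}^{x+\delta} F'(t) \, dt \ge F'(x) \cdot \delta = (\lfloor \lg x \rfloor + 2) \delta \]

and

\[ F(y) - F(y - \delta) = \int_{y-\delta}^{y} F'(t) \, dt \le F'(y) \cdot \delta = (\lfloor \lg y \rfloor + 2) \delta. \]

This yields:

\[ F(y) - F(y - \delta) \le (\lfloor \lg y \rfloor + 2) \delta \le (\lfloor \lg x \rfloor + 2) \delta \le F(x + \delta) - F(x). \]

By adding $F(x) + F(y - \delta)$ on both sides we obtain the first claim of
Lem. 3.5. Note that $\lim_{\varepsilon \to 0} F(\varepsilon) = 0$. Hence the
second claim follows from the first by considering the limit
$\delta \to y$. □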
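The first claim can be spot-checked numerically. The sketch below assumes that
$F(x) = x \lfloor \lg x \rfloor + 2 (x - 2^{\lfloor \lg x \rfloor})$, a form
chosen here because its right derivative is $\lfloor \lg x \rfloor + 2$ and it
tends to $0$ as $x \to 0$; the actual definition of $F$ lies outside this
appendix. It tests $F(x) + F(y) \le F(x + \delta) + F(y - \delta)$ on a grid
with $x \ge y > \delta \ge 0$.

```python
import math


def F(x):
    """Assumed form of F: x*floor(lg x) + 2*(x - 2**floor(lg x)), with F(0) = 0."""
    if x <= 0:
        return 0.0
    k = math.floor(math.log2(x))
    return x * k + 2 * (x - 2.0 ** k)


def first_claim_holds(x, y, delta, eps=1e-9):
    """Check F(x) + F(y) <= F(x + delta) + F(y - delta) up to rounding slack eps."""
    return F(x) + F(y) <= F(x + delta) + F(y - delta) + eps


# Grid in steps of 1/4 with x >= y > delta >= 0.
checks = [first_claim_holds(x / 4, y / 4, d / 4)
          for x in range(1, 80)
          for y in range(1, x + 1)
          for d in range(0, y)]
print(all(checks))
```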

B Proof of Lem. 3.10

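Proof. First note that the probability for choosing the $k$-th element as pivot
satisfies

\[ \binom{n}{2c+1} \cdot \Pr[\text{pivot} = k] = \binom{k-1}{c} \binom{n-k}{c}. \]

We use the notation of falling factorial
$x^{\underline{\ell}} = x \cdots (x - \ell + 1)$. Thus,
$\binom{x}{\ell} = \frac{x^{\underline{\ell}}}{\ell!}$ and

\[
\Pr[\text{pivot} = k]
= \frac{(2c+1)! \cdot (k-1)^{\underline{c}} \, (n-k)^{\underline{c}}}{(c!)^2 \cdot n^{\underline{2c+1}}}
= \binom{2c}{c} (2c+1) \frac{1}{n-2c} \prod_{i=0}^{c-1} \frac{(k-1-i)(n-k-i)}{(n-2i-1)(n-2i)}.
\]

For $k \le c$ we have $\Pr[\text{pivot} = k] = 0$. So, let
$c < k \le \frac{n}{2} - \delta n$ and let us consider an index $i$ in the
product with $0 \le i < c$.

\[
\begin{aligned}
\frac{(k-1-i)(n-k-i)}{(n-2i-1)(n-2i)}
&\le \frac{(k-i)(n-k-i)}{(n-2i)(n-2i)} \\
&= \frac{\left( \left(\frac{n}{2} - i\right) - \left(\frac{n}{2} - k\right) \right) \cdot \left( \left(\frac{n}{2} - i\right) + \left(\frac{n}{2} - k\right) \right)}{(n-2i)^2} \\
&= \frac{\left(\frac{n}{2} - i\right)^2 - \left(\frac{n}{2} - k\right)^2}{(n-2i)^2} \\
&\le \frac{1}{4} - \frac{\left(\frac{n}{2} - \left(\frac{n}{2} - \delta n\right)\right)^2}{n^2}
= \frac{1}{4} - \delta^2.
\end{aligned}
\]

We have $\binom{2c}{c} \le 4^c$. Since $2c + 1 \le \frac{n}{2}$, we obtain:

\[ \Pr[\text{pivot} = k] \le 4^c (2c+1) \frac{1}{n-2c} \left( \frac{1}{4} - \delta^2 \right)^c < (2c+1) \cdot \alpha^c \cdot \frac{2}{n}. \]

Now, we obtain the desired result.

\[ \Pr\left[\text{pivot} \le \frac{n}{2} - \delta n\right] < \sum_{k=0}^{\lfloor \frac{n}{2} - \delta n \rfloor} (2c+1) \cdot \alpha^c \cdot \frac{2}{n} \le (2c+1) \alpha^c \]

□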
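For concrete parameters the bound can be compared with the exact probability.
The following sketch uses exact rational arithmetic; the function names are
hypothetical, and $\alpha = 1 - 4\delta^2 = 4(\frac{1}{4} - \delta^2)$ is an
assumption here, since the definition of $\alpha$ is part of the lemma
statement and not repeated in this appendix.

```python
from fractions import Fraction
from math import comb


def pivot_probability(n, c, k):
    """Exact Pr[pivot = k] when the pivot is the median of 2c+1 out of n elements."""
    return Fraction(comb(k - 1, c) * comb(n - k, c), comb(n, 2 * c + 1))


def tail_probability(n, c, delta):
    """Exact probability that the pivot has rank at most n/2 - delta*n."""
    limit = int(n / 2 - delta * n)
    return sum(pivot_probability(n, c, k) for k in range(1, limit + 1))


n, c, delta = 1000, 50, 0.2
alpha = 1 - 4 * delta ** 2            # assumed definition of alpha
bound = (2 * c + 1) * alpha ** c
print(float(tail_probability(n, c, delta)), bound)
```

For these parameters the exact tail probability is far below the bound, which
is expected: the estimate is designed to be simple, not tight.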

C More experimental results 

We also compare the different pivot selection strategies on the basic
QuickHeapsort with no modifications. We test samples of sizes one, three,
approximately $\lg n$, $\sqrt[4]{n}$, $\sqrt{n} / \lg n$, $\sqrt{n}$, and
$n^{3/4}$ for the pivot selection.

In Table 2 the average number of comparisons and the standard deviations
are listed. We ran the algorithms on arrays of length 10000 and one million.
The displayed data is the average resp. standard deviation of 100 runs of
QuickHeapsort with the respective pivot selection strategy.

These results are not very surprising: The larger the samples get, the smaller
is the standard deviation. The average number of comparisons reaches its
minimum with a sample size of approximately $\sqrt{n}$ elements. One notices
that the difference for the average number of comparisons is relatively small,
especially between the different pivot selection strategies with non-constant
sample sizes. This confirms experimentally that the $o(n)$-terms in Thm. 3.1
and Thm. 3.9 are not too big.
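A pivot selection with a sample of about $\sqrt{n}$ elements could look like
the following sketch. It is not the implementation used for the experiments:
the function name is hypothetical, the sample is drawn with `random.sample`,
and the median is found by sorting the sample for brevity, whereas an
implementation that counts comparisons would rather use a selection algorithm
such as Quickselect.

```python
import math
import random


def choose_pivot(a, rng=random):
    """Return the median of a random sample of about sqrt(len(a)) elements of a."""
    size = max(1, math.isqrt(len(a))) | 1   # force an odd sample size
    size = min(size, len(a))
    sample = rng.sample(a, size)            # sample without replacement
    sample.sort()                           # simplification, see the text above
    return sample[len(sample) // 2]


random.seed(0)
data = list(range(10001))
print(choose_pivot(data))   # some element near the middle of the range
```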



                            n = 10^4                            n = 10^6
Sample size                 Average number    Standard          Average number    Standard
                            of comparisons    deviation         of comparisons    deviation
--------------------------------------------------------------------------------------------
1                           152573            4.281             21975912          3.452
3                           146485            2.169             21327478          1.494
~ $\lg n$                   143669            0.954             20945889          0.525
~ $\sqrt[4]{n}$             143620            0.857             20880430          0.352
~ $\sqrt{n} / \lg n$        142634            0.413             20795986          0.315
~ $\sqrt{n}$                142642            0.305             20783631          0.281
~ $n^{3/4}$                 147134            0.195             20914822          0.168

Table 2: Different strategies for pivot selection tested on $10^4$ and $10^6$
elements. The standard deviation of our experiments is given in percent of the
average number of comparisons.



D Some Words about the Worst Case Running Time



Obviously the worst case running time depends on how the pivot element is
chosen. If just one random element is used as pivot we get the same quadratic
worst case running time as for Quicksort. However, the probability that
QuickHeapsort runs into such a "bad case" is not higher than in Quicksort,
since any choice of pivot elements leading to a worst case scenario in
QuickHeapsort also yields the worst case for Quicksort.

If we choose the pivot element as median of approximately $2 \lg n$ elements,
we get a worst case running time of $O(n^2 / \lg n)$, i.e. for the worst case
it makes almost no difference whether the pivot is selected as median of
$2 \lg n$ elements or just as one random element.
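A rough count for the partitioning phases shows where a bound of order
$n^2 / \lg n$ comes from. This is a back-of-the-envelope sketch, assuming that
a partitioning step on $m$ elements costs at most $m$ comparisons plus lower
order terms for the pivot selection. The median of about $2 \lg m$ sample
elements has at least about $\lg m$ elements on either side, so every step
removes at least about $\lg m$ elements.

```latex
% One partitioning step on m elements removes at least about lg m of them:
\[ T(m) \le m + T(m - \lg m). \]
% While m >= sqrt(n) we have lg m >= (1/2) lg n, so there are at most
% 2n / lg n such steps, each costing at most n. The remaining steps work on
% fewer than sqrt(n) elements and cost at most sqrt(n) * sqrt(n) = n in total:
\[ T(n) \le \frac{2n}{\lg n} \cdot n + n = O\!\left( \frac{n^2}{\lg n} \right). \]
```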

However, if we use approximately $\frac{n}{f(n)}$ elements as sample for the
pivot selection, we can get a better bound on the worst case.

Let $f : \mathbb{N} \to \mathbb{N}_{\ge 1}$ be some monotonically growing
function with $f \in o(n)$ (e.g. $f(n) = \lg n$). We can apply the ideas of
the Median of Medians algorithm [1]: First we choose $\frac{n}{f(n)}$ random
elements, then we group them into groups of five elements each. The median of
each group can be determined with six comparisons [12, p. 215]. Now, the
median of these medians can be computed using Quickselect. We assume that
Quickselect is implemented with the same strategy for pivot selection. That
means we get the same recurrence relations for the worst case complexity of
the partitioning-phases in QuickHeapsort and for the worst case of
Quickselect:

\[ T(n) = n + \frac{6n}{5 f(n)} + T\!\left( \frac{n}{5 f(n)} \right) + T\!\left( n - \frac{3n}{10 f(n)} \right). \]

This yields $T(n) \le c \, n f(n)$ for some $c$ large enough. Hence with this
pivot selection strategy, we reach a worst case running time for QuickHeapsort
of $n \lg n + O(n f(n))$ and - if $f(n) \in \omega(1)$ - average running time
as stated in Sect. 3.
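The claim $T(n) \le c \, n f(n)$ can be explored numerically by tabulating a
recurrence of this kind bottom-up. The sketch below makes several assumptions
that should be read as illustrative only: the recurrence is taken as
$T(n) = n + \frac{6n}{5 f(n)} + T(\frac{n}{5 f(n)}) + T(n - \frac{3n}{10 f(n)})$
(six comparisons per group of five, Quickselect on the group medians, and the
usual Median of Medians guarantee that at least $\frac{3}{10}$ of the sample
lies on each side of the pivot), $f(n) = \lg n$ as in the example, and a crude
quadratic base case for $n \le 16$.

```python
import math


def f(n):
    """Example choice f(n) = lg n, clamped so that f(n) >= 1."""
    return max(1.0, math.log2(n)) if n >= 2 else 1.0


def worst_case_partition_cost(limit, base=16):
    """Tabulate T(0..limit) for the assumed recurrence, bottom-up."""
    T = [0.0] * (limit + 1)
    for n in range(1, limit + 1):
        if n <= base:
            T[n] = float(n * n)                    # crude base case (assumption)
            continue
        fn = f(n)
        medians = int(n / (5 * fn))                # input size of the Quickselect call
        removed = math.ceil(3 * n / (10 * fn))     # elements guaranteed to be cut off
        T[n] = n + 6 * n / (5 * fn) + T[medians] + T[n - removed]
    return T


limit = 10**5
T = worst_case_partition_cost(limit)
ratio = max(T[n] / (n * f(n)) for n in range(2, limit + 1))
print(ratio)   # stays bounded by a constant, i.e. T(n) <= c * n * f(n)
```

With these choices an induction over the recurrence gives $c = 22$ as a
sufficient constant; the tabulated ratio stays below that value.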

Driving this strategy to the end and choosing $f(n) = 1$ leads to Ultimate
Heapsort (or better a slight modification of it - and Quickselect turns into
the Median of Medians algorithm). Then we have $T(n) = n \lg n + O(n)$ for the
worst case of QuickHeapsort. However, our bound for the average case does not
hold anymore.

In order to obtain an $n \lg n + O(n)$-bound for the worst case without losing
our bound for the average case, we can apply a simple trick: Whenever after
the partitioning it turns out that the pivot does not lie in the interval
$\{\frac{n}{4}, \dots, \frac{3n}{4}\}$, we switch to Ultimate Heapsort. This
immediately yields the worst case bound of $n \lg n + O(n)$. Moreover, the
proof of Thm. 3.9 can easily be changed in order to deal with this
modification: Let $C \cdot n$ be the worst case number of comparisons for
pivot selection and partitioning in Ultimate Heapsort. We can change Eq. (8)
to

\[ \Pr\left[\text{pivot} \le \frac{n}{2} - \delta n\right] < \frac{1}{8C} \, \varepsilon. \]

Then, the rest of the proof is exactly the same. Hence, Thm. 3.9 and Thm. 3.1
are also valid when switching to Ultimate Heapsort in the case of a 'bad'
choice of the pivot.
E Pseudocode of basic QuickHeapsort



Algorithm 5.1 

procedure QuickHeapsort(A[1..n])
begin
    if n > 1 then
        p := ChoosePivot;
        k := PartitionReverse(A[1..n], p);
        if k ≤ n/2 then
            TwoLayerMaxHeap(A[1..n], k - 1);      (* heap-area: {1..k - 1} *)
            swap(A[k], A[n - k + 1]);
            QuickHeapsort(A[1..n - k]);           (* recursion *)
        else
            TwoLayerMinHeap(A[1..n], n - k);      (* heap-area: {k + 1..n} *)
            swap(A[k], A[n - k + 1]);
            QuickHeapsort(A[(n - k + 2)..n]);     (* recursion *)
        endif
    endif
endprocedure



The ChoosePivot function returns an element p of the array chosen as pivot.
The PartitionReverse function returns an index k and rearranges the array A
so that p = A[k], A[i] ≥ A[k] for i < k and A[i] ≤ A[k] for i > k using n - 1
comparisons.



Algorithm 5.2 

function SpecialLeaf(A[1..m]):
begin
    i := 1;
    while 2i ≤ m do    (* i.e. while i is not a leaf *)
        if 2i + 1 ≤ m and A[2i + 1] > A[2i] then
            A[i] := A[2i + 1];
            i := 2i + 1;
        else
            A[i] := A[2i];
            i := 2i;
        endif
    endwhile
    return i;
endfunction



Algorithm 5.3 

procedure TwoLayerMaxHeap(A[1..n], m)
begin
    ConstructHeap(A[1..m]);
    for i := 1 to m do
        temp := A[n - i + 1];
        A[n - i + 1] := A[1];
        j := SpecialLeaf(A[1..m]);
        A[j] := temp;
    endfor
endprocedure



The procedure TwoLayerMinHeap is symmetric to TwoLayerMaxHeap, so we 
do not present its pseudocode here. 
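For readers who want to experiment, the three algorithms above can be
transcribed into runnable Python roughly as follows. This is a sketch under
several assumptions and not the implementation used for the experiments: it
uses 0-based arrays, picks a single random element as pivot, replaces the tail
recursion by a loop, builds the heap with the standard bottom-up construction,
and realizes TwoLayerMinHeap as a mirrored heap whose root sits at the right
end of the array. The helper parameters `pos`, `out` and `above` (heap index
to array index, output slot of the i-th extracted element, and heap order) are
names introduced here for illustration.

```python
import random


def partition_reverse(a, lo, hi):
    """Partition a[lo..hi] around a random pivot; larger elements go to the left."""
    r = random.randint(lo, hi)            # simplest ChoosePivot: one random element
    a[r], a[hi] = a[hi], a[r]
    pivot, k = a[hi], lo
    for i in range(lo, hi):               # hi - lo comparisons
        if a[i] > pivot:
            a[i], a[k] = a[k], a[i]
            k += 1
    a[k], a[hi] = a[hi], a[k]
    return k                              # a[lo..k-1] > a[k] >= a[k+1..hi]


def sift_down(a, i, m, pos, above):
    """Standard sift-down of heap node i in a heap of size m (for construction)."""
    x = a[pos(i)]
    while 2 * i <= m:
        c = 2 * i
        if c + 1 <= m and above(a[pos(c + 1)], a[pos(c)]):
            c += 1
        if not above(a[pos(c)], x):
            break
        a[pos(i)] = a[pos(c)]
        i = c
    a[pos(i)] = x


def special_leaf(a, m, pos, above):
    """Move the hole at the root down to a leaf, promoting the better child."""
    i = 1
    while 2 * i <= m:                     # while i is not a leaf
        c = 2 * i
        if c + 1 <= m and above(a[pos(c + 1)], a[pos(c)]):
            c += 1
        a[pos(i)] = a[pos(c)]
        i = c
    return i


def two_layer_heap(a, m, pos, out, above):
    """Extract all m heap elements to out(1..m), refilling leaves from those slots."""
    for start in range(m // 2, 0, -1):    # bottom-up heap construction
        sift_down(a, start, m, pos, above)
    for i in range(1, m + 1):
        temp = a[out(i)]
        a[out(i)] = a[pos(1)]
        j = special_leaf(a, m, pos, above)
        a[pos(j)] = temp


def quick_heapsort(a):
    """Sort the list a in place in ascending order."""
    lo, hi = 0, len(a) - 1
    while lo < hi:                        # loop instead of tail recursion
        k = partition_reverse(a, lo, hi)
        left, right = k - lo, hi - k      # sizes of the two parts
        if left <= right:                 # max-heap on a[lo..k-1]
            two_layer_heap(a, left, lambda i: lo + i - 1,
                           lambda i: hi - i + 1, lambda x, y: x > y)
            a[k], a[hi - left] = a[hi - left], a[k]
            hi = hi - left - 1
        else:                             # mirrored min-heap on a[k+1..hi]
            two_layer_heap(a, right, lambda i: hi - i + 1,
                           lambda i: lo + i - 1, lambda x, y: x < y)
            a[k], a[lo + right] = a[lo + right], a[k]
            lo = lo + right + 1


data = [5, 3, 9, 1, 5, 8, 2, 7]
quick_heapsort(data)
print(data)
```

The heap is always built on the smaller side of the partition, so the slots
that receive the extracted elements all lie on the other side, and the
elements taken from those slots can serve as the "second layer" that fills the
leaves.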